This zip file contains the data, codebook, programs, and output, all based on the de-identified data, that accompany "Race, Ethnicity, and NIH Awards."  The files are:

1.  Science_NIH_deidentified.dta.  STATA 11 dataset containing selected de-identified variables.

2.  Science_NIH_deidentified.csv.  Delimited ASCII text version of the dataset, containing the same selected de-identified variables.

3.  NIH_Deidentified_codebook.pdf.  Codebook with variable definitions and tabulations of the data.

4.  Science_descriptive_deident1.do.  STATA program that generates the estimates for versions of Figure 1 and Tables S2, S11, S12, and S13 using the de-identified data.

5.  Science_probit_deidentified1.do.  STATA program that estimates the probit models for versions of Table S5, Figure 3, Table S14, and Table S15 using the de-identified data.

6.  Science_deidentified_tables.xls.  Excel spreadsheet containing versions of Tables S2, S5, S11, S12, S13, S14, and S15 using the de-identified data.
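
For readers without STATA, the .csv file can be inspected with other tools. The following is a minimal Python sketch, not part of the archive, that lists the column names and counts the data rows. It assumes the file is comma-delimited with a header row; neither is confirmed here, so check the result against the codebook. The function name summarize_delimited is hypothetical.

```python
import csv
import os


def summarize_delimited(path, delimiter=","):
    """Return (column_names, row_count) for a delimited text file.

    Assumes the first row is a header and that `delimiter` (a comma by
    default) separates the fields. Both are assumptions about the file.
    """
    with open(path, newline="", encoding="ascii", errors="replace") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])           # first row: column names
        row_count = sum(1 for _ in reader)  # remaining rows: data records
    return header, row_count


if __name__ == "__main__":
    csv_path = "Science_NIH_deidentified.csv"  # file 2 in the list above
    if os.path.exists(csv_path):
        columns, rows = summarize_delimited(csv_path)
        print("columns:", columns)
        print("data rows:", rows)
    else:
        print("not found:", csv_path)
```

Run the script from the directory where the zip file was unpacked. If the column list comes back as a single long name, the file probably uses a different delimiter; pass it explicitly, for example delimiter="\t".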

Questions about replicating the findings using the de-identified data and STATA programs may be addressed to Donna Ginther (dginther@ku.edu). 